A diagram is introduced for visualizing matrix product states which makes transparent a connection
between matrix product factorizations of states and operators, and complex weighted finite
state automata. It is then shown how one can proceed in the opposite direction: writing an automa- 
ton that "generates" an operator gives one an immediate matrix product factorization of it. Matrix 
product factorizations have the advantage of reducing the cost of computing expectation values by 
facilitating caching of intermediate calculations. Thus our connection to complex weighted finite 
state automata yields insight into what allows for efficient caching in matrix product algorithms. 
Finally, these techniques are generalized to the case of multiple dimensions. 
I. MOTIVATION



Straightforward representations of quantum systems 
tend to grow exponentially with increasing system size. 
Thus, if one wants to have a hope of simulating a large 
quantum system, one needs to pick a clever method for 
representing it. Such a representation needs to have three 
properties: it must have a scaling law that makes large 
systems tractable, it needs to faithfully duplicate the 
properties of the original system, and it needs to allow 
one to compute expectation values without returning to 
the expensive representation. 

Matrix product states [1, 2, 3, 4, 5] have been popular
in the last couple of decades because they exhibit all
three properties: they grow linearly with the size of the
system, they tend to produce good approximations for
many interesting systems [3], and they allow for O(N)
calculations of expectations for tensor product operators
(where N is the size of the system). Most operators are
not tensor products, but one can always write them as
a sum of tensor product operators. A general operator
would require an exponential number of terms to do this,
but fortunately most operators of interest require only
O(N) terms. Thus, the cost of computing an expectation
for a matrix product state is usually O(N^2). This result
can be improved, however, by using matrix product operators.
Indeed, if one can factor an operator into a matrix
product (in addition to factoring the state), then one can
reduce the O(N^2) calculation to an O(N) calculation.



In practice, one can do even better than this. A typical
application of matrix product states is the
variational method^1. This technique involves sweeping
through the matrices and locally optimizing each site.
Naively, this would require O(N^2) computation time at
each site (N terms in the operator, O(N) for each term),
but if one performs the "sweep" by moving from adjacent
site to adjacent site, then one can cache the old results of
computations in such a way as to achieve O(1) computation
time per site for an overall running time of O(N)
per sweep.

In the past, this O(1) behavior has typically been
achieved by writing a special caching algorithm for each
Hamiltonian [3]. However, by writing a caching code that
works with matrix product operators, one can achieve
this in a general way for all Hamiltonians; one need only
supply as input a matrix product factorization of an operator.
Thus we see that it is very useful to be
able to write down a matrix product factorization for operators
of interest.

In this paper, we shall present techniques that help 
simplify and clarify the construction of matrix product 
operators. Our path towards this simplification pro- 
ceeds by showing that matrix product states and matrix 
product operators can be thought of as finite complex 
weighted automata. This equivalence allows us to recast 
the problem of supplying a matrix product factorization 
for an operator as the problem of constructing a com- 
plex weighted finite automaton. The correspondence be- 
tween matrix product operators and complex weighted fi- 
nite automata opens up the door for applying techniques 
across the usually disparate subjects of matrix product 
algorithms and finite automata. Thus, for example, op- 
erations defined on finite automata, which are regularly 
used to construct and understand finite automata, can 
now be applied to matrix product operators. 

The connection we establish between matrix product 
states and complex weighted finite state automata can be 
viewed as a formalization of the intuition behind much 
of the language used to describe these states. For example,
finitely correlated states [8], an early version of
matrix product states, were first conceived of by thinking
about the tensor index connecting different matrices
in matrix product states as a memory used in constructing,
one subsystem at a time, a quantum state. Similarly,
much language used to describe matrix product operators 
speaks of the tensor index connecting different matrices 
as a signal or correlation between adjacent subsystems. 
One of our contributions in this paper is to point out that 
these intuitions can be formalized in that matrix prod- 
uct states can be thought of as complex weighted finite 
automata and that this view extends to matrix product 
operators. This latter property allows us to engineer
matrix product operators by designing finite automata and
to apply the techniques and methods of finite automata
in this process.

1 This idea was originally proposed by Ostlund and Rommer [?].
It was inspired, however, by the DMRG algorithm (originally
proposed by White [?]), which has proven to be a very effective
means of finding quantum ground states.

The paper is organized as follows. We begin with
some background material that presents a pedagogical 
introduction to matrix product states. Then we will in- 
troduce the key to our method: a new type of diagram 
which allows one to visualize matrix product states in a 
way that makes transparent the type of state that they 
generate. Although the application of these diagrams to 
matrix product states is novel, the diagrams themselves 
are not: rather, they will be shown to be merely vari- 
ants of complex weighted finite state automata. Once 
this connection has been made, it will be shown how one 
can obtain a matrix product factorization of a state or 
operator by starting with an automaton that generates 
the "pattern" of the operator, and then translating this 
automaton into a set of matrix factors. It will then be 
shown how this process generalizes to multiple dimen- 
sions, where the automata connection is particularly in- 
sightful. 



II. ONE-DIMENSIONAL CHAINS 
A. Background 

Consider a quantum system with N independent observables,
such as the Z spin components of a linear one-dimensional
chain of spin-1/2 particles. In general, the
representation of this system must be expressed as a tensor
with N indices, $\Psi_{i_1, i_2, \dots, i_N}$. Each element of this
tensor represents the amplitude of a particular system
configuration; for example, $\Psi_{\downarrow\uparrow\uparrow\downarrow}$ gives the amplitude of
a particular system of four particles being in the $|\downarrow\uparrow\uparrow\downarrow\rangle$
state. Part of the difficulty in simulating quantum mechanical
systems arises from the fact that when one adds
another particle to a system, one must add another index
to the representing tensor. Thus, the information needed
to represent a quantum state in general grows exponentially
with the number of particles.

Fortunately, it turns out that not all quantum states
require the full content of an N-index tensor. Some
states are special in that they are separable, which means
that their N-index tensor can be factored into the outer
product of N one-index tensors,

$$\Psi_{\alpha\beta\gamma\cdots} = A_\alpha B_\beta C_\gamma \cdots \qquad (1)$$

This representation is very nice because it grows only 
linearly with the number of observables; since it is so nice, 
in fact, it is not surprising that it comes with a price: it 
cannot be used to model systems with any entanglement. 

It would be nice to be able to add some entanglement
into the above representation in such a way that we do
not cause it to revert to the full N-index tensor. For
example, suppose that the observables corresponding to
indices $\alpha$ and $\beta$ in (1) were maximally entangled, i.e.,
their state is given by

$$|\uparrow\uparrow\rangle + |\downarrow\downarrow\rangle \;\longleftrightarrow\; \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},$$

where the matrix elements correspond to configuration
amplitudes as shown in the following table:

$$\begin{array}{c|cc}
 & \alpha = \uparrow & \alpha = \downarrow \\ \hline
\beta = \uparrow & 1 & 0 \\
\beta = \downarrow & 0 & 1
\end{array}$$



(Note that we are not normalizing our states; this will 
not be a concern for the purpose of our discussion.) 

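There is no way to obtain the above matrix by taking
the outer product of two vectors, $A_\alpha$ and $B_\beta$ in (1),
but one could obtain it by taking the inner product of
two matrices, such as

$$A_{\alpha i} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad
B_{i\beta} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
\quad\Longrightarrow\quad
\sum_i A_{\alpha i} B_{i\beta} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$

Thus, we see that we could represent our state in the
form,

$$\Psi_{\alpha\beta\gamma\cdots} = \sum_i A_{\alpha i} B_{i\beta} C_\gamma \cdots$$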



This had the desired result - we were able to add 
a small amount of entanglement to our separable state 
without greatly enlarging it. The inner index i can be 
thought of as a "bond" between two of the particles that 
allows them to communicate to each other. 
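
As a quick numerical check of the argument above, the following sketch (Python with NumPy; the variable names are illustrative and do not come from the text) confirms that the amplitude matrix of the maximally entangled pair has rank 2, so it cannot be an outer product of two vectors, while it is reproduced by summing over a single bond index i. The choice of identity matrices for both factors is one possible factorization, not the only one.

```python
import numpy as np

# Amplitude matrix of |up,up> + |down,down>; rows index alpha, columns index beta.
bell = np.array([[1.0, 0.0],
                 [0.0, 1.0]])

# An outer product of two vectors always has rank 1, so rank 2 rules it out.
rank = np.linalg.matrix_rank(bell)

# Two factors joined by a bond index i of dimension 2 (illustrative choice).
A = np.eye(2)  # A[alpha, i]
B = np.eye(2)  # B[i, beta]
rebuilt = np.einsum("ai,ib->ab", A, B)  # sum over the bond index i

print(rank, np.allclose(rebuilt, bell))
```

The bond dimension needed here equals the rank of the amplitude matrix, which is one way to read the statement that the inner index carries the entanglement between the two particles.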

If one wished, one could put a bond between all the 
particles in the (linear one-dimensional) chain, 



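$$\Psi_{\alpha\beta\gamma\delta\cdots} = \sum_{i,j,k,\dots} A_{\alpha i} B_{i\beta j} C_{j\gamma k} D_{k\delta\cdots} \cdots \qquad (2)$$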



at which point one would obtain what is called a "matrix 
product state". 

This method is not limited to representations of states; 
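it is likewise possible to factor operators into so-called
"matrix product operators" [6],

$$O_{(\alpha\beta\gamma\delta\cdots)(\alpha'\beta'\gamma'\delta'\cdots)} = \sum_{i,j,k,\dots} A_{\alpha\alpha' i} B_{i\beta\beta' j} C_{j\gamma\gamma' k} D_{k\delta\delta'\cdots} \cdots$$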



Matrix product states have gained much interest in the 
last decade because they turn out to have entanglement 
properties that are sufficient to represent many systems 
of interest. Furthermore, they are very flexible: one can 
add additional inner indices whenever one wants to intro- 
duce entanglement between two particles, forming tensor 
networks that can represent systems in any number of 
dimensions and with any lattice structure. 

In this paper, it will prove useful to distinguish be- 
tween two types of indices: the indices being summed 



over, which correspond to entanglement introduced be- 
tween observables, and those not, which correspond to 
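the observables themselves. Thus, the former will be denoted
by subscripts and the latter will be denoted by
superscripts; for example, (2) should appear in the form,

$$\Psi^{\alpha\beta\gamma\delta\cdots} = \sum_{i,j,k,\dots} A^{\alpha}_{i} B^{\beta}_{ij} C^{\gamma}_{jk} D^{\delta}_{k\cdots} \cdots$$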



B. Matrix product diagram 

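Consider the four-particle W state

$$|\Psi\rangle = |\downarrow\uparrow\uparrow\uparrow\rangle + |\uparrow\downarrow\uparrow\uparrow\rangle + |\uparrow\uparrow\downarrow\uparrow\rangle + |\uparrow\uparrow\uparrow\downarrow\rangle,$$

which is a sum over all possible states in which one and
only one particle has spin-down. A matrix product representation
of this state is

$$\Psi^{\alpha\beta\gamma\delta} = \sum_{i,j,k} A^{\alpha}_{i} B^{\beta}_{ij} C^{\gamma}_{jk} D^{\delta}_{k},$$

where $\alpha$ is the index of the spin component of the first
particle, $\beta$ is the index of the second particle, etc., and
the tensors on the right-hand side are given by

$$A^{\uparrow} = \begin{pmatrix} 1 & 0 \end{pmatrix}, \quad
A^{\downarrow} = \begin{pmatrix} 0 & 1 \end{pmatrix}, \quad
B^{\uparrow} = C^{\uparrow} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad
B^{\downarrow} = C^{\downarrow} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad
D^{\uparrow} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \quad
D^{\downarrow} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}.$$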

Alternatively, one may use the following notation. Instead
of writing a separate matrix for each value of the
superscript indices, label each matrix element by
a value of the observable. Furthermore, adopt the convention
that when taking the inner product between matrices,
one should multiply the matrix elements together
by using the outer product. This allows us to express
our state in the more compact (and hopefully transparent)
form,

$$\Psi =
\underbrace{\begin{pmatrix} \uparrow & \downarrow \end{pmatrix}}_{A} \cdot
\underbrace{\begin{pmatrix} \uparrow & \downarrow \\ 0 & \uparrow \end{pmatrix}}_{B} \cdot
\underbrace{\begin{pmatrix} \uparrow & \downarrow \\ 0 & \uparrow \end{pmatrix}}_{C} \cdot
\underbrace{\begin{pmatrix} \downarrow \\ \uparrow \end{pmatrix}}_{D}
\qquad (3)$$



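To illustrate that this factorization of our state works,
we step through the multiplication of the matrices, starting
with the two on the right:

$$\begin{aligned}
\Psi &= \begin{pmatrix} \uparrow & \downarrow \end{pmatrix}
\begin{pmatrix} \uparrow & \downarrow \\ 0 & \uparrow \end{pmatrix}
\begin{pmatrix} \uparrow\downarrow + \downarrow\uparrow \\ \uparrow\uparrow \end{pmatrix} \\
&= \begin{pmatrix} \uparrow & \downarrow \end{pmatrix}
\begin{pmatrix} \uparrow\uparrow\downarrow + \uparrow\downarrow\uparrow + \downarrow\uparrow\uparrow \\ \uparrow\uparrow\uparrow \end{pmatrix} \\
&= \uparrow\uparrow\uparrow\downarrow + \uparrow\uparrow\downarrow\uparrow + \uparrow\downarrow\uparrow\uparrow + \downarrow\uparrow\uparrow\uparrow
\end{aligned}$$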
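
The same multiplication can be carried out mechanically. The sketch below (plain Python; the helper `sym_matmul` and the labels `"u"`/`"d"` for spin-up/spin-down are illustrative assumptions, not notation from the text) treats each matrix element as a formal sum of spin strings and multiplies elements by concatenation, which plays the role of the outer product. It multiplies from left to right rather than from the right, which gives the same result because the product is associative.

```python
from collections import Counter

def sym_matmul(P, Q):
    """Multiply matrices whose entries are formal sums of spin strings.

    Each entry is a Counter mapping a string such as "ud" to its coefficient;
    multiplying two entries concatenates their strings (the outer product).
    """
    rows, inner, cols = len(P), len(Q), len(Q[0])
    out = [[Counter() for _ in range(cols)] for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for m in range(inner):
                for s, x in P[r][m].items():
                    for t, y in Q[m][c].items():
                        out[r][c][s + t] += x * y
    return out

u, d, z = Counter({"u": 1}), Counter({"d": 1}), Counter()  # up, down, zero

A = [[u, d]]            # 1 x 2
B = [[u, d], [z, u]]    # 2 x 2
C = [[u, d], [z, u]]    # 2 x 2
D = [[d], [u]]          # 2 x 1

state = A
for M in (B, C, D):
    state = sym_matmul(state, M)

w_state = dict(state[0][0])
print(w_state)
```

Every surviving string contains exactly one `"d"`, each with coefficient 1, which is the four-particle W state.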



This factorization may be equivalently expressed in the
form of the diagram in figure 1a, which was obtained
directly from the matrices in (3). The nodes correspond
to indices, and the edges correspond to matrix elements.
The matrices were treated as a table of weights for edges
connecting each set of nodes. That is, for each 2x2
matrix $M_{ij}$, the elements were mapped to edges as shown
in figure 1c. Where a matrix element was zero, the edge
was omitted. The nodes shared between edges indicate
common indices being summed over.

The arrows place an ordering on the indices. They are 
not strictly necessary to define the diagrams, but they 
are useful because they allow one to view the diagram 
in terms of paths. Specifically, each choice of indices 
corresponds to a "walk" from the left side of the diagram 
to the right. For example, the choice i = 0, j = 0, k = 1
corresponds to the walk shown in Fig. 1b.

Each possible walk from the left to the right generates
a term in our sum, so that the walk shown in Fig. 1b generates
the term $\uparrow\uparrow\downarrow\uparrow$. As discussed earlier, edges which
do not appear in the diagram correspond to vanishing
matrix elements; this may be thought of as disallowing a
walk between certain nodes, as any term which tries to
include nonexistent edges is multiplied by zero and thus
does not contribute to the sum. For example, in figure
1a note that there is no path that returns to the top
from the bottom (such as i = 0, j = 1, k = 0).




(a) Diagram representing the matrix product form of the
W state. Each possible "walk" from left to right generates a term
in the state, as illustrated in 1b.

(b) An example walk through the matrix product state illustrated
in 1a, which generates the term $\uparrow\uparrow\downarrow\uparrow$.




(c) Mapping of edges to matrix elements. 

FIG. 1: Matrix product diagrams 




Extension to operators 



FIG. 2: Matrix product diagram for a magnetic field 
operator. 



We are not restricted to labeling edges of matrix product
diagrams with states; the tensors at each site may
be objects with any number of superscript indices. This
allows us to factor operators as well as states, resulting
in a matrix product operator^2, which recall has the form

$$O^{(\alpha\beta\gamma\delta\cdots)(\alpha'\beta'\gamma'\delta'\cdots)} = \sum_{i,j,k,\dots} A^{\alpha\alpha'}_{i} B^{\beta\beta'}_{ij} C^{\gamma\gamma'}_{jk} D^{\delta\delta'}_{k\cdots} \cdots$$

For example, if we were to take the matrix product
representation for the W state, as given in (3), and replace
$\uparrow$ with the identity matrix and $\downarrow$ with the Pauli Z
spin matrix (which we will denote by Z), then we would
obtain the diagram in figure 2.

This operator represents the action of a magnetic field 
in the z direction coupling to each of the particles. 
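
The substitution can be checked numerically. The sketch below (Python with NumPy; the tensor layout `T[i, j]`, the choice of four sites, and all variable names are assumptions made for illustration) stores the operator on the edge from bond value i to bond value j, contracts the chain site by site, and compares the result with a direct sum of single-site Z operators.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
Z = np.diag([1.0, -1.0])

# Site tensor: T[i, j] is the 2x2 operator on the edge from bond value i to j,
# i.e. the W-state matrix with "up" replaced by I and "down" replaced by Z.
T = np.zeros((2, 2, 2, 2))
T[0, 0], T[0, 1], T[1, 1] = I2, Z, I2

n_sites = 4
acc = T[0]  # first site: only edges leaving bond value 0, i.e. the row (I  Z)
for _ in range(n_sites - 1):
    dim = acc.shape[1]
    # Sum over the shared bond index b; physical indices combine as a Kronecker product.
    acc = np.einsum("bij,bckl->cikjl", acc, T).reshape(2, 2 * dim, 2 * dim)
field = acc[1]  # last site: only edges arriving at bond value 1, i.e. the column (Z, I)

def z_on(site):
    """Dense operator acting with Z on one site and the identity elsewhere."""
    return reduce(np.kron, [Z if k == site else I2 for k in range(n_sites)])

direct = sum(z_on(s) for s in range(n_sites))
print(np.allclose(field, direct))
```

Only the boundary conditions (start in bond value 0, end in bond value 1) distinguish the first and last sites here, so the same bulk tensor can be reused for any chain length.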

C. Weighted finite automata states 



Up to this point we have considered our state to be
an N-dimensional tensor, where N is the number of
observables. Let us now use a different but equivalent
description that is applicable when all of our observables
are of the same kind (e.g., z components of spin). First,
we define a set $\Sigma$ to be our "alphabet"; it contains all
the possible values for our observable. For example, for
a spin-1/2 chain, we shall choose $\Sigma := \{0, 1\}$, where 0 labels
the spin-up state and 1 labels the spin-down state.
Then we may describe our state as a function that maps
strings of length N over the alphabet $\Sigma$ to complex numbers:
$f : \Sigma^N \to \mathbb{C}$.

We can generalize this function. Suppose that the size
of our system is itself a variable; that is, we want to
consider systems with 1 particle, 2 particles, etc., and to
have the descriptions for all of these systems captured in
a single function. Then we can make our function a map
not from $\Sigma^N$, but from $\Sigma^*$, the set of all finite-length
strings of $\Sigma$ symbols.

When phrased in this form, it can be shown that saying
that our state has a matrix product representation
is equivalent to saying that the function $f$ can be computed
by a special kind of weighted finite automaton. A
complex-weighted finite automaton^3 is defined by a 5-tuple,
$(Q, \Sigma, W, \alpha, \Omega)$, where

1. $Q$ is a finite set of states

2. $\Sigma$ is a finite alphabet; in our case, we shall let
$\Sigma = \{0, 1\}$ for the two possible values of the z component
of spin

3. $W : Q \times \Sigma \times Q \to \mathbb{C}$ is the weight function; we
may equivalently represent this function as a set of
complex $Q \times Q$ matrices, $W_a$, one for each symbol $a$ in
our alphabet $\Sigma$

4. $\alpha$ : $1 \times Q$ is the (complex-valued) initial distribution

5. $\Omega$ : $Q \times 1$ is the (complex-valued) final distribution

For a string $a_0 a_1 \dots a_N \in \Sigma^*$, the output of our automaton
is defined to be

$$f(a_0 a_1 \dots a_N) = \alpha \cdot W_{a_0} \cdot W_{a_1} \cdots W_{a_N} \cdot \Omega \qquad (4)$$



2 Matrix product operators were originally introduced by Verstraete,
Garcia-Ripoll and Cirac [?], but they were used as density
operators, i.e., as representations of states, rather than of
Hamiltonians or other physical operators. McCulloch, however,
later showed how many classes of physical operators can be factored,
and discussed why it can be useful to write them in this
form [10].

3 Complex-weighted automata are a generalization of real-weighted
finite automata, which were originally introduced by
Culik and Kari as a technique for compressing grayscale images
[11, 12, 13]. It is worth noting that Latorre devised a very similar
algorithm for image compression motivated by matrix product
states, though without making the connection to finite state
automata [14]. This is interesting because it shows how the separate
fields of quantum physics and computer science have independently
converged to the same idea.



A finite state automaton can be thought of as a machine
which moves from one state to another based
on an input signal. To see a simple example of this, we
consider a simplification of a weighted finite automaton
called a deterministic finite automaton, which outputs
either 0 ("reject") or 1 ("accept"). The values of all matrices
($W_a$, $\alpha$, and $\Omega$) are restricted to be either 1 or
0, and there may only be one non-zero matrix element
in each row of each matrix. For example, the following
machine accepts (or "recognizes") all strings that end in
either two 0's or two 1's and rejects all others:

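1. $Q := \{A, B, C, D, E\}$

2. $\Sigma := \{0, 1\}$

3. With rows and columns ordered A through E,

$$W_0 := \begin{pmatrix}
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0
\end{pmatrix}, \qquad
W_1 := \begin{pmatrix}
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 1
\end{pmatrix}$$

4. $\alpha := \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \end{pmatrix}$

5. $\Omega := \begin{pmatrix} 0 & 0 & 1 & 0 & 1 \end{pmatrix}^T$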



While this is the canonical form, it is not very transparent.
The diagram in figure 3 is an equivalent method
of defining this automaton. The unconnected arrow on
the far left indicates that the system should start in state
A. C and E are shaded to indicate that they are the states
that the machine "accepts"; that is, the machine outputs
one (as the value for $f$, not to be confused with the
symbol 1 in the alphabet) if and only if it is currently
in such a state at the end of a string; otherwise, it outputs
zero. The arrows indicate how the machine should
transition between states in response to a symbol; for example,
the machine will move from state A to state B if
the first symbol is a 0, and to state D if the first symbol
is a 1.

FIG. 3: (Color online) Example finite state automaton
which recognizes any string that ends in either two 0's
or two 1's.

This machine works by using states B through E as
a sort of "memory". States B and D are used for the
machine to remember that the last symbol it saw (a 0 or
a 1, respectively) was different from the one before
it; states C and E are used for the machine to remember
that it has already seen two symbols of a kind.

Note that for each state there is one and only one tran- 
sition for each symbol, and this transition has weight 1; 
this is due to our restriction that our machine had to be 
a deterministic finite automaton. Removal of this restric- 
tion allows us to have zero or multiple transitions for each 
symbol at each state, and also to give a weight to each 
transition. Because of this, each input string can have 
multiple paths, or even no paths through our diagram; 
for each possible path we associate a weight equal to the 
product of all the weights along the path; furthermore, 
there may be more than one initial state, and each ini- 
tial state and final state may itself have a weight. The 
output of our automaton is the sum of all weights of all
paths from all initial to all final states. This procedure is
not an extension of our definition of a weighted finite automaton,
but rather a restatement of it, as it is implicit
in (4).
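
To make the matrix form of the definition concrete, the following sketch (Python with NumPy) evaluates the output formula (4) for the deterministic example above. The transition table is an assumption reconstructed from the verbal description of states B through E (for instance, that reading a 0 in state C keeps the machine in C), and the helper names are illustrative.

```python
import numpy as np

states = ["A", "B", "C", "D", "E"]

def transition_matrix(moves):
    """Build W[i, j] = 1 if the machine moves from state i to state j."""
    W = np.zeros((5, 5))
    for src, dst in moves.items():
        W[states.index(src), states.index(dst)] = 1.0
    return W

# Assumed transitions: B/C track trailing 0's, D/E track trailing 1's.
W = {
    "0": transition_matrix({"A": "B", "B": "C", "C": "C", "D": "B", "E": "B"}),
    "1": transition_matrix({"A": "D", "B": "D", "C": "D", "D": "E", "E": "E"}),
}
alpha = np.array([1.0, 0.0, 0.0, 0.0, 0.0])  # start in state A
omega = np.array([0.0, 0.0, 1.0, 0.0, 1.0])  # accept in states C and E

def output(string):
    """Evaluate alpha . W_{a_0} ... W_{a_N} . omega for the given string."""
    vec = alpha
    for symbol in string:
        vec = vec @ W[symbol]
    return float(vec @ omega)

results = [output(s) for s in ["00", "0100", "1011", "0110", "1"]]
print(results)
```

Because each row of each matrix has a single non-zero entry, the row vector `vec` always has exactly one non-zero component; dropping that restriction turns the same loop into the weighted sum over paths described above.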

Also, note that in this light a matrix product state can 
be seen as just a special case of a weighted finite automa- 
ton. Each of the nodes on the diagrams drawn earlier is a 
state, with the edges between them labeling transitions. 
In a matrix product state, however, there is a separate 
transition matrix for each position in the string; that is, 
the third symbol always passes through the same region 
on the graph, and no other symbol passes through this 
same region, whereas in a finite state automaton all states 
are potentially accessible to all symbols. (Equivalently,
one could say that a matrix product state is a weighted
finite automaton in which all the transition matrices are
block diagonal.)

(a) Finite state automaton recognizing the W state.

(b) Finite state automaton recognizing the state with neighboring 1's.

FIG. 4: (Color online) Examples of finite state
automata which recognize quantum states

The ability to share states allows one to write down
very compact representations of states in weighted finite
automaton form. For example, the W state can be expressed
as an automaton with only two states, as shown in figure
4a.

Again, observe that our states act as a form of memory.
When the machine is in state A, it has not yet seen a 1.
When the machine is in state B, it has already seen a 1.
Upon seeing a 1, it either transitions from A to B, or dies
if it is already in B (i.e., outputs 0 for the state). With
this manner of thinking, it is easy to see how to extend
this machine to output the state

$$110000\cdots + 011000\cdots + 001100\cdots + \dots,$$

that is, the set of states with two neighboring 1's; we
already have a state, B, which indicates that the machine
has seen one 1, so all we have to do is add another state,
C, which indicates that it has seen two 1's. We also have
to update the transitions so that the machine dies unless
the two 1's are neighbors. The result is shown in figure
4b.
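
Both machines can be tried out with a small path-summing routine. The sketch below (plain Python) is one way to do it under stated assumptions: the dictionary encoding of transitions, the function name `automaton_output`, and the particular test strings are illustrative, and all weights are taken to be 1 as in the text. It carries along the accumulated weight of every reachable state, which is equivalent to multiplying the transition matrices in (4).

```python
def automaton_output(transitions, initial, final, string):
    """Sum the weights of all paths that read `string`.

    transitions maps (state, symbol) to a list of (next_state, weight);
    initial and final map states to their weights.
    """
    current = dict(initial)
    for symbol in string:
        nxt = {}
        for state, weight in current.items():
            # A missing (state, symbol) entry means the machine "dies" on that path.
            for target, w in transitions.get((state, symbol), []):
                nxt[target] = nxt.get(target, 0) + weight * w
        current = nxt
    return sum(weight * final.get(state, 0) for state, weight in current.items())

# W state: A = "no 1 seen yet", B = "exactly one 1 seen".
w_auto = {("A", "0"): [("A", 1)], ("A", "1"): [("B", 1)], ("B", "0"): [("B", 1)]}

# Neighboring 1's: B = "just saw the first 1", C = "saw both 1's".
pair_auto = {("A", "0"): [("A", 1)], ("A", "1"): [("B", 1)],
             ("B", "1"): [("C", 1)], ("C", "0"): [("C", 1)]}

w_vals = [automaton_output(w_auto, {"A": 1}, {"B": 1}, s)
          for s in ("0100", "0110", "000", "00001")]
pair_vals = [automaton_output(pair_auto, {"A": 1}, {"C": 1}, s)
             for s in ("0110", "0101", "1100", "1110")]
print(w_vals, pair_vals)
```

Note that the same two-state machine handles strings of any length, which is the compactness gained by sharing states between positions.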

Just as it is possible to view a matrix product state 
as a special case of a weighted finite state automaton, 
it is always possible to construct a matrix product state 
from a weighted finite automaton. To do this, one cre- 
ates a copy of all the states of the automaton for each 
particle in the system, and remaps the transitions so that 
one is always moving from one set of states to another. 
Finally, one removes all but the initial transition in the 
leftmost set of states and all but the final transition in 
the rightmost set of states. For example, for the state
just described the matrix product representation would
be as shown in figure 5, where the faded states and edges
indicate that they have been removed.

At this point, note that we have obtained a factorization
of the nearest neighbor coupling operator,

XXII + IXXI + IIXX. 

To see this, we observe that this has the same pattern 
as the state 

1100 + 0110 + 0011, 

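which is what we just factored above. Using the same
form for the diagram, we see that the matrix factorization
is

$$\begin{pmatrix} I & X & 0 \end{pmatrix} \cdot
\begin{pmatrix} I & X & 0 \\ 0 & 0 & X \\ 0 & 0 & I \end{pmatrix} \cdot
\begin{pmatrix} I & X & 0 \\ 0 & 0 & X \\ 0 & 0 & I \end{pmatrix} \cdot
\begin{pmatrix} 0 \\ X \\ I \end{pmatrix}$$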



Thus, we see that we now have a method for factoring 
operators: 

1. Write down a weighted finite automaton which generates
the pattern of the operator.

2. Translate this into a matrix product operator dia- 
gram. 

3. Write down matrices based on the diagram. 
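
The three steps above can be sketched in code for the nearest neighbor coupling operator. In the example below (Python with NumPy), the edge dictionary, the helper `op_matmul`, and the choice of four sites are assumptions made for illustration; the operator-valued matrix product multiplies entries with the Kronecker product, mirroring the convention used for the matrices in the text.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
states = ["A", "B", "C"]

# Step 1: automaton generating the pattern ...IXXI... (edges carry operators).
edges = {("A", "A"): I2, ("A", "B"): X, ("B", "C"): X, ("C", "C"): I2}
initial, final = "A", "C"

# Steps 2 and 3: one copy of the states per site gives an operator-valued matrix.
zero = np.zeros((2, 2))
bulk = [[edges.get((s, t), zero) for t in states] for s in states]
first = [bulk[states.index(initial)]]                # keep only the initial row
last = [[row[states.index(final)]] for row in bulk]  # keep only the final column

def op_matmul(P, Q):
    """Matrix product in which entries are combined with the Kronecker product."""
    return [[sum(np.kron(P[r][m], Q[m][c]) for m in range(len(Q)))
             for c in range(len(Q[0]))] for r in range(len(P))]

n_sites = 4
mpo = reduce(op_matmul, [first] + [bulk] * (n_sites - 2) + [last])[0][0]

def term(label):
    """Dense operator for a string such as 'IXXI'."""
    return reduce(np.kron, [X if ch == "X" else I2 for ch in label])

target = term("XXII") + term("IXXI") + term("IIXX")
print(np.allclose(mpo, target))
```

Changing `n_sites` reuses the same `bulk` matrix unchanged, which reflects why the automaton picture is most economical for translationally invariant operators.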

This method is most efficient for operators that are 
translationally invariant. If an operator has additional 
position-dependent structure, then one should incorpo- 
rate this structure into the matrix product diagrams, 
rather than into the weighted finite automata. To see 
what is meant by this, suppose our coupling operator 
took the peculiar form, 

XXII + IXZI + IIXX. 

FIG. 6: (Color online) An example of what happens to
an automaton (fig. 6a) and a matrix product state (fig.
6b) when translational invariance is lost.

FIG. 7: An example of using box-and-line notation to
represent a network of tensors.

It is still possible to write down a weighted finite automaton
for this operator, as shown in figure 6a. However,
as you can see, capturing this position-dependent behavior
requires the addition of several states, which means
that our matrix factors would have to be much larger.
Thus, rather than proceeding in this way, it would be
better to note that this operator looks almost like the
previous operator except with a Z in a special place, and
then proceed by modifying the previous diagram in that
single spot to obtain the diagram shown in figure 6b, which
corresponds to the factorization,







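$$\begin{pmatrix} I & X & 0 \end{pmatrix} \cdot
\begin{pmatrix} I & X & 0 \\ 0 & 0 & X \\ 0 & 0 & I \end{pmatrix} \cdot
\begin{pmatrix} I & X & 0 \\ 0 & 0 & Z \\ 0 & 0 & I \end{pmatrix} \cdot
\begin{pmatrix} 0 \\ X \\ I \end{pmatrix}$$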



Thus we see that when encoding translationally invari- 
ant behavior in an operator it is better to work with 
weighted finite automata, and when encoding position 
dependent behaviour it is better to work with matrix 
product diagrams. A particularly significant consequence
is that natural operations on finite automata, like
constructing unions, concatenations, and intersections of their
languages, can now be put to use in constructing matrix
product operators of increasing complexity. Recently,
methods for engineering complex quantum systems
with effective Hamiltonians that are extremely
complicated have become important for quantum computation
[15, 16, 17]. Finding matrix product factorizations
for these Hamiltonians is a non-trivial task. However, the
above finite automaton picture brings to bear a new set
of tools for obtaining such factorizations.
(a) Matrix product state by itself

(b) Expectation of an arbitrary operator




(c) Expectation of a matrix product operator. 

FIG. 8: Box-and-line notation applied to matrix 
product states. 



D. Calculation of expectations 

In the preceding section we have demonstrated a
method for obtaining a factorization of an operator by
thinking about the operator as a complex weighted
finite automaton. In this section, we shall show how a
matrix factorization of an operator allows us to compute
expectations of matrix product states efficiently.

We shall use "box-and-line" notation to represent a 
network of tensors that is being partly or fully contracted, 



so for example the tensor network 

* — } j A % t5 m U kl U- l 

ijkl 

is represented by the diagram shown in figure 7, where
the boxes represent tensors, edges connecting boxes rep- 
resent summed (internal) indices, and edges with arrows 
indicate external indices. 

With this notation, we see that the matrix product
state given by

$$\Psi^{\alpha\beta\gamma\delta} = \sum_{ijk} (S_1)^{\alpha}_{i} (S_2)^{\beta}_{ij} (S_3)^{\gamma}_{jk} (S_4)^{\delta}_{k}$$

is represented by the diagram in figure 8a, and the expectation
value of this state with respect to some operator
$O^{(\alpha'\beta'\gamma'\delta'),(\alpha\beta\gamma\delta)}$ is given by the diagram in figure 8b. If
this were the best we could do, then the matrix product
state would not have helped us very much because
we would still need to perform an exponential number of
calculations. Fortunately, we can improve upon this if
we can factor $O$ into matrix product form,

$$O^{(\alpha'\beta'\gamma'\delta'),(\alpha\beta\gamma\delta)} = \sum_{i'',j'',k''} (O_1)^{\alpha'\alpha}_{i''} (O_2)^{\beta'\beta}_{i''j''} (O_3)^{\gamma'\gamma}_{j''k''} (O_4)^{\delta'\delta}_{k''},$$

so that our tensor network now becomes that shown in
figure 8c.

This sum is now performed in two stages; first, we
sum the site and operator matrix at each index to form
"transfer matrices",

$$(E_1)_{i' i'' i} = \sum_{\alpha'\alpha} (S_1^*)^{\alpha'}_{i'} (O_1)^{\alpha'\alpha}_{i''} (S_1)^{\alpha}_{i}, \quad \dots$$

thus forming the new tensor network shown in figure 9,
and then we contract the new network.^4 This procedure
takes O(3N) matrix multiplications, and the largest matrices
that we ever need to form are the $E_n$ matrices.
Thus we see that computing expectation values for a matrix
product operator is an O(N) procedure^5.

FIG. 9: Building transfer matrices for the expectation
of a matrix product operator.

FIG. 10: Formation of a matrix that allows us to
express the expectation of $O$ in quadratic form with
respect to site 3.

4 We note here that there is an alternative viewpoint of matrix
product states which considers the transfer matrices themselves
to be the primary object of interest. So-called "finitely correlated
states" [8] are characterized (somewhat abstractly) by a map
$\mathbb{E} : \mathcal{A} \otimes \mathcal{B} \to \mathcal{B}$, which in essence produces transfer matrices
(tensors in some finite-dimensional tensor space $\mathcal{B} \to \mathcal{B}$) from
observables (operators in a C*-algebra $\mathcal{A}$).

5 Upon completion of this work, we learned of similar results by
Murg et al. [?].


E. Energy minimization and caching

Up to now, we have discussed how to perform operations
on matrix product states that are known. In general,
however, one will want to investigate systems for
which the eigenstates are unknown. In this case, matrix
product states provide an ansatz for the variational
method. That is, one assumes that a ground state has a
particular matrix product form, and then searches for the
matrix elements which give the lowest energy state representable
in that form; put another way, one seeks the
matrix product state $\Psi$ that minimizes the normalized
expectation value of the Hamiltonian

$$f(\Psi) := \frac{\langle \Psi | H | \Psi \rangle}{\langle \Psi | \Psi \rangle}.$$

The hope is that the result of this process will be a reasonable
approximation to the true ground state.

In general, finding the global minimizer of $f$ is an
NP-complete^6 problem. Fortunately, a local search
heuristic suffices for many systems of interest: at each
step in the minimization process, all but one of the site
matrices are held constant, and then the energy is minimized
with respect to the single remaining matrix.

6 NP-complete problems are the hardest problems in the complexity
class NP (nondeterministic polynomial time) and are widely
suspected to be computationally intractable.

Suppose we are varying over the third matrix. Recall
that the expectation of an operator can be represented
by a diagram of the form shown in figure 8c; since we are
holding all but the third matrix constant, we can form
the matrix $\mathcal{O}_3$ which is the contraction of all the tensors
in the network save $S_3$ and its conjugate, as shown in
figure 10. We see now that the energy as a function of
$S_3$ is just the quadratic ratio form

$$f(S_3) = \frac{S_3^* \cdot \mathcal{H}_3 \cdot S_3}{S_3^* \cdot \mathcal{N}_3 \cdot S_3},$$

where $\mathcal{H}_3$ and $\mathcal{N}_3$ are the aforementioned tensor contractions
for $O = H$ and $O = I$ respectively. It can be
shown that minimizing the above form (a Rayleigh quotient)
is equivalent to solving the generalized eigenvalue
problem

$$\mathcal{H}_3 \cdot S_3 = \lambda \, \mathcal{N}_3 \cdot S_3,$$

and then picking the eigenvector $S_3$ with the smallest
value for $\lambda$. The cost of solving this eigenvalue problem
depends only on the size of the site matrix $S_3$, not on the
number of sites, N; however, it also relies on $\mathcal{H}_3$, which
in general is very expensive to calculate.
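
The equivalence between minimizing the quotient and solving the generalized eigenvalue problem can be illustrated numerically. The sketch below (Python with NumPy) uses random real symmetric stand-ins for the two contracted networks, here called `H3` and `N3`; these matrices, their size, and the Cholesky-based reduction to a standard eigenproblem are all assumptions made for illustration and are not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6  # size of the flattened site tensor (illustrative)

# Stand-ins for the contracted networks: symmetric H3, positive-definite N3.
G = rng.normal(size=(n, n))
H3 = (G + G.T) / 2
K = rng.normal(size=(n, n))
N3 = K @ K.T + n * np.eye(n)

def rayleigh(x):
    """Quadratic ratio form x.H3.x / x.N3.x for a real vector x."""
    return (x @ H3 @ x) / (x @ N3 @ x)

# Reduce H3 x = lambda N3 x to a standard problem using N3 = C C^T.
C = np.linalg.cholesky(N3)
C_inv = np.linalg.inv(C)
evals, evecs = np.linalg.eigh(C_inv @ H3 @ C_inv.T)
lam = evals[0]               # eigh returns eigenvalues in ascending order
s3 = C_inv.T @ evecs[:, 0]   # minimizing site vector

# No random trial vector should fall below the smallest eigenvalue.
trial = min(rayleigh(rng.normal(size=n)) for _ in range(1000))
print(lam, rayleigh(s3), trial)
```

For complex site tensors the transposes would become conjugate transposes; the real case is used here only to keep the sketch short.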

Fortunately, if we can factor $H$ into a matrix product
operator, then computing $\mathcal{H}_3$ is cheap. As in the previous
section, we observe that we may form $E_i$ matrices
by contracting $S_i^*$, $O_i$, and $S_i$ together at each site; furthermore,
we may also contract all of the $E_i$ matrices to
the left of site 3 to form $L_3$ and all of the $E_i$ matrices to
the right of site 3 to form $R_3$. The result of this is the
form shown in figure 11.

Computing $L_i$ and $R_i$ at some site $i$ would naively be
an $O(N)$ operation; however, by using caching we can
instead make it an amortized$^7$ $O(1)$ operation. To see
why, note that $L_i$ and $R_i$ may be computed recursively:

$$L_1 = I, \quad L_i = L_{i-1} \cdot E_{i-1},$$
$$R_N = I, \quad R_i = R_{i+1} \cdot E_{i+1}.$$



So once we have $R_1$ we already have $R_2$ through $R_N$.
Thus, if we start by minimizing the energy with respect
to site 1, and then sweep to the right (i.e., site 2, site
3, up to site $N$), then although it took us $O(N)$ time to
compute $R_1$, we get the $R_i$ matrices for all of the rest
of the sites up through $N$ for free. Once we hit site $N$,
we start moving left back through $N-1$, $N-2$, etc.,
and at each step it only takes us one additional matrix
multiplication to compute $R_i$ from $R_{i+1}$. Thus, the time
needed at each step to compute $R_i$ is amortized $O(1)$; by
a similar argument, we see the same for the $L_i$ matrices.
This process is illustrated in figure 12.

7 By "amortized" here we mean that although it takes $O(N)$ time
to initialize $L_1$ and $R_1$ at the first step, it takes $O(1)$ for all
remaining steps, and there are typically at least $N$ steps, so on
average the operation takes $O(1)$ time per step.

FIG. 11: Tensor contractions used to compute $L_3$ and $R_3$.
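
The caching scheme for the sweep can be sketched as follows. This is a simplified illustration under stated assumptions: every transfer matrix is taken to be a plain square NumPy array (all bond indices flattened), so that the contraction becomes an ordered matrix product with the left environment multiplying from the left and the right environment from the right. The function name `sweep_environments` is hypothetical.

```python
import numpy as np

def sweep_environments(E):
    """Yield (site, L, R) for a left-to-right sweep over the sites.

    E is a list of N square transfer matrices (a simplifying assumption).
    L is the product of E[0..site-1]; R is the product of E[site+1..N-1].
    """
    n, d = len(E), E[0].shape[0]
    # One O(N) pass builds and caches every right environment.
    R = [None] * n
    R[n - 1] = np.eye(d)
    for i in range(n - 2, -1, -1):
        R[i] = E[i + 1] @ R[i + 1]
    # During the sweep each new left environment costs one multiplication.
    L = np.eye(d)
    for i in range(n):
        yield i, L, R[i]
        L = L @ E[i]

rng = np.random.default_rng(1)
E = [rng.normal(size=(3, 3)) for _ in range(5)]
full = np.linalg.multi_dot(E)
checks = [np.allclose(L @ E[i] @ R, full) for i, L, R in sweep_environments(E)]
```

In an actual minimization the matrix at the current site changes before the sweep moves on; the cached right environments stay valid because they depend only on sites that have not yet been visited.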

The notion of using caching to speed up these calcula- 
tions is not a new one; the same process has already been 
described by Verstraete, Porras, and Cirac [3]. However,
whereas their process is limited to one- and two-body 
operators, our procedure works for any form of operator 
that can be written in matrix product form; furthermore, 
the process is the same for all such operators, rather than 
requiring a new process for each class of operator - e.g., 
one-body, two-body, etc. 

To summarize: if we can factor our Hamiltonian into
a matrix product operator, then we can calculate our
matrices $\mathcal{H}_i$ and $\mathcal{N}_i$ needed to optimize a site matrix
from matrices that can be calculated using a recursion
rule. By caching the intermediate steps of the recursion,
and moving through the sites to be optimized in order
from left to right and back, we can calculate $\mathcal{H}_i$ and
$\mathcal{N}_i$ in amortized $O(1)$ time.

FIG. 12: Use of recursion and caching to calculate $L$
and $R$ in amortized $O(1)$ time at each site.



III. ARBITRARY-DIMENSIONAL GRAPH 
A. Motivation 

Matrix product states were designed for studying one- 
dimensional systems, and that is where they excel. Noth- 
ing technically stops one from using them to study 
higher-dimensional systems, however - as long as one 
is willing to effectively reduce these systems into a one- 
dimensional system by imposing an ordering; for example,
on a 6 x 6 two-dimensional grid one could impose the
ordering shown in figure 13a.

FIG. 13: (Color online) A total ordering is imposed on
the two-dimensional grid in 13a; this allows us to write
down a finite state automaton representation of the 4-X
operator shown in 13b, but the resulting automaton in
13c is less than ideal.
(a) Total ordering imposed on two-dimensional grid.
(b) 4-X operator on grid. The X's represent the locations of the
X operators on the grid; at all other sites there are I operators.
(c) Finite state automaton needed for the operator in 13b given the
total ordering shown in 13a.

However, there is a price one pays for doing this. Suppose
that one wants to represent a Hamiltonian on a
6 x 6 grid which consists of four-site X terms arranged
in a square as shown in Fig. 13b. The automaton which
encodes such a Hamiltonian takes the form shown in Fig.
13c. The states in the middle (D-G) act as a memory
which tells the automaton how many sites it has walked
past since the second X. This is needed so that the automaton
can put the last two X's in the correct place on
the following row. The number of states required here
grows with the number of columns in the grid.
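
The growth of this memory can be illustrated with a few lines of arithmetic. The sketch below assumes a plain row-major ordering of the grid (the ordering drawn in the figure may differ in detail), and the helper names are hypothetical.

```python
def flattened_positions(row, col, n_cols):
    """Positions of the four X operators of a 2x2 square whose top-left
    corner is (row, col), under an assumed row-major ordering of the grid."""
    top = row * n_cols + col
    bottom = (row + 1) * n_cols + col
    return [top, top + 1, bottom, bottom + 1]

def memory_states_needed(n_cols):
    """Identity sites the automaton must count between the 2nd and 3rd X."""
    p = flattened_positions(0, 0, n_cols)
    return p[2] - p[1] - 1

gaps = {n_cols: memory_states_needed(n_cols) for n_cols in (4, 6, 8)}
```

Under this assumption a six-column grid leaves four identity sites between the second and third X, which would match the four memory states D-G, and the count grows linearly with the number of columns.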

We see that although we can write down such an automaton,
and so form a matrix product representation of
this operator, it is less than ideal because the representation
depends on the size of the grid. This comes from
the fact that information can only flow in one direction; 
ideally, we would like the information that an X has been 
placed on one row to somehow go directly down one row 
rather than having to sweep through the rest of the cur- 
rent row first. We could, of course, adjust the ordering to 
sweep down columns instead of across rows, but then we 
lose the ability to cheaply send information across a row; 
using matrix product states, there is no way we can make 
it easy to communicate in two directions simultaneously. 



B. Tensor network diagrams 

The previous section has described the limitations of 
matrix product states. These limitations come from the 
fact that each tensor is connected by indices to only two 
other tensors. (Or equivalently, we might say that the 
problem is that each site is directly entangled with only 
two other sites.) We can get a more powerful representa- 
tional form, as for example was done using the concept 
of projected entangled pairs in [2(| HH, by using a more 
complicated index structure; for example, we could use 
the following structure: 



ijkl 



(5) 



In this example, we see that there are many different
types of factors that are possible. The first, $A^\alpha$, is a
simple outer-product factor; this indicates that there is
no entanglement between $\alpha$ and any of the other observables.
The second two tensors, $B^\beta_i$ and $C^\gamma_{ijk}$, are connected
by an inner product - i.e., a sum over the subscript index
$i$ - and so we see that $\beta$ and $\gamma$ have some entanglement
between them. $\gamma$ is also entangled with $\delta$ and $\mu$ through
the index $j$ and $\nu$ through the index $k$; this illustrates
that entanglement may be shared between one observable
and any number of others, and that those other observables
need not be directly adjacent to a factor in the
above. Note that the factor $D$ does not have a superscript;
it does not give any direct information about an
observable, but rather (in a manner of speaking) it coordinates
communication between observables. On the
other hand, $E^{\delta\mu}$ has two superscripts, so that it gives
information about two observables at once. Finally, note
that the index $l$ is shared between three tensors; putting
the same index in multiple places allows several observables
to be simultaneously entangled with each other.
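
A contraction of this general kind can be written with `numpy.einsum`, which allows one index to appear in more than two operands. The index placement used below ($A^\alpha$, $B^\beta_i$, $C^\gamma_{ijk}$, $D_l$, $E^{\delta\mu}_{jl}$, $F^\nu_{kl}$) is an assumption made purely for illustration, since only the qualitative structure is described in the text; the dimensions and random entries are likewise arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
d, chi = 2, 3                              # physical and bond dimensions (arbitrary)
A = rng.normal(size=(d,))                  # A^alpha: no bond index at all
B = rng.normal(size=(d, chi))              # B^beta_i
C = rng.normal(size=(d, chi, chi, chi))    # C^gamma_{ijk}
D = rng.normal(size=(chi,))                # D_l: no physical index
E = rng.normal(size=(d, d, chi, chi))      # E^{delta mu}_{jl}: two physical indices
F = rng.normal(size=(d, chi, chi))         # F^nu_{kl}

# Letters: a=alpha, b=beta, g=gamma, x=delta, m=mu, n=nu; i, j, k, l are summed.
# The index l appears in three operands (D, E and F) at once.
psi = np.einsum("a,bi,gijk,l,xmjl,nkl->abgxmn", A, B, C, D, E, F)
```

Because `A` shares no index with the other tensors, the result factorizes as an outer product in its first index, which is the numerical counterpart of the statement that the first observable is unentangled with the rest.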

One possible diagram with the above tensor structure
is that shown in Fig. 14a. Just as in the one-dimensional
matrix product states, we may put arrowheads on the
edges, and think of our states as being generated by
walks through the diagram. However, now there are
points where our walk may split into many paths (in this
case, the indices $j$ and $k$) which are taken simultaneously.
Whenever these two paths rejoin, they must rejoin at the
same node or else the walk is rejected. For example, Fig.
14b illustrates an invalid choice of path, whereas Fig. 14c
illustrates a correct choice.

FIG. 14: Tensor product state diagrams.
(a) A diagram representing the tensor product structure of eqn. (5).
(b) An invalid walk through 14a.
(c) An acceptable walk through 14a.

This rule for the rejoining of paths is just a restatement 
of the fact that only one value may be picked for each 
index for each term in the sum. 

Note that although there is a partial ordering on the 
steps in our walk, there is not a total ordering. That is,
although our arrowheads tell us that a link from C must 
be chosen before a link from E or F, they do not tell 
us whether a link from E should be chosen before F or 
vice versa. This contrasts with the one-dimensional case, 
where there is a total ordering. 



C. Weighted finite signaling agents 

Recall that in section II C we showed that matrix product
diagrams are equivalent to defining a weighted finite
automaton which encodes the state. If we wanted, we could
similarly relate our generalized tensor network states to
weighted finite automata. There is a catch, though: automata
require the system to be in a concrete "state" at
any moment in time, and they also require there to be
a total ordering of the input. Our tensor network state
diagrams have neither of these properties.

To see why these properties are absent, we return to
figure 14a. For the $B$ transition, the system picks one
of the states in i in response to the first input symbol, 
so both properties hold. For the C transition, however, 
the system picks two new states for the system - one 
from j and one from k - in response to the second input 
symbol. At this point, not only is the system in multiple 
states, but the symbol to which it will respond next is not 
defined, since the order of E and F has not been specified. 

Of course, we could force these properties to be present
by combining indices $j$ and $k$ into a single index $m$ that
unifies them (i.e., $m = 1$ would be equivalent to $j = 1$, $k = 1$;
$m = 2$ would be equivalent to $j = 1$, $k = 2$, etc.). We would
then have to replace our tensors $E$
and $F$ with a single tensor $G$. It is possible that in this
particular situation we would obtain something simpler,
but in general this process will result in much larger and
more complicated tensors than we started with.
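
The cost of fusing two indices can be seen in a toy example. The sketch below drops all physical indices and the shared index for brevity (an assumption made only to keep the example small) and uses zero-based indices, so the fused index is `m = j * chi_k + k`.

```python
import numpy as np

rng = np.random.default_rng(3)
chi_j, chi_k = 2, 3
C = rng.normal(size=(chi_j, chi_k))      # C_{jk}: sends one signal along j and one along k
E = rng.normal(size=(chi_j,))            # E_j
F = rng.normal(size=(chi_k,))            # F_k

# Fuse (j, k) into a single index m = j * chi_k + k.
C_m = C.reshape(chi_j * chi_k)
G_m = np.outer(E, F).reshape(chi_j * chi_k)   # single tensor replacing E and F

split = np.einsum("jk,j,k->", C, E, F)   # contraction with separate indices
fused = C_m @ G_m                        # the same number from the fused form
```

Here `G_m` holds `chi_j * chi_k` entries where `E` and `F` together held only `chi_j + chi_k`, which is the growth in tensor size referred to above.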

Thus, instead of thinking in terms of states, it proves
more useful to think in terms of signals. Each site corresponds
to an agent that receives signals from channels,
makes a (nondeterministic, weighted) decision based on
these incoming signals and an input symbol, and then
sends signals to output channels. $i$, $j$, $k$, $l$, and the rest
of the unlabeled nodes above correspond to our channels,
and $B$-$F$ correspond to our agents.

To see how this works in practice, we consider how one
might design an agent to generate the 4-X Hamiltonian
(fig. 13b) discussed in section III A. We allow our agents
to receive signals from two directions: up and left, and to
send signals in two directions: down and right; the flow
of information is illustrated in figure 15.

The grey circles represent sites (or "agents") in our
system, and the arrows represent links (or "channels").
At each site is an agent, which is a rank 5 tensor with
an index corresponding to each channel and an index
corresponding to the input symbol. The (slightly grayed)
arrows on the outside of the diagram that connect to
only one node implement the boundary conditions; they
do this by starting the system with a particular set of
"initial" signals, sent through the top and left boundary
channels, and then accepting only those inputs that cause
the signal received from the bottom and right boundary
channels to be one of the valid "final" signals.

FIG. 15: Flow of information for finite signaling agent
on two-dimensional grid

Each signal is an integer that corresponds to an index
in a tensor; it often proves convenient, though, to
map these integers to names in order to make clear the
working of the agent. For example, table Ia gives the
signal names that we use for the agent recognizing our
4-X Hamiltonian.

We define an agent by specifying how it reacts to in- 
coming signals. In this case, there are two incoming sig- 
nals: one from above, and one from the left. Since the 
agent is nondeterministic, it can have several possible re- 
actions to the incoming signals, each corresponding to 
a symbol it recognizes (or generates) and signals that it 
sends right and down. For our 4-X Hamiltonian, our
agent takes the form defined in table Ib.

To see what is going on, consider figure 16a, which
illustrates the agent accepting four X's on the grid. The
background at each point is shaded to indicate which of
the above transitions is taking place at that point. Note
that the grid is divided into three general regions: the
exterior (shaded white), the boundary (shaded dark grey),
and the interior (shaded light grey). Inside the boundary
and the interior, Xs are excluded because there is
no transition that includes them. The boundary has the
role of forbidding additional squares of Xs from being
accepted in the exterior, since we have chosen our transitions
so that boundaries can only be continued in one
direction (vertical or horizontal). As figure 16b illustrates,
a second group of Xs which is in the exterior of the first
group results in an intersecting boundary that causes the
pattern to be rejected.

TABLE I: These tables define a finite signaling agent
which recognizes the 4-X Hamiltonian.

(a) For convenience, labels are assigned to index numbers in order
to give intuitive names to the signals.

Name             Index Number
Exterior         0
Boundary w/ X    1
Boundary         2
Interior w/ X    3
Interior         4

(b) The transition table defining the finite signaling agent.

Input (from above)   Input (from left)   Symbol   Output (to right)   Output (downward)
Exterior             Exterior            I        Exterior            Exterior
Exterior             Exterior            X        Boundary w/ X       Boundary w/ X
Boundary w/ X        Exterior            X        Interior w/ X       Boundary
Exterior             Boundary w/ X       X        Boundary            Interior w/ X
Exterior             Boundary            I        Boundary            Interior
Boundary             Exterior            I        Interior            Boundary
Interior w/ X        Interior w/ X       X        Interior            Interior
Interior             Interior            I        Interior            Interior

FIG. 16: Transitions experienced by an agent for two
possible terms.
(a) A term accepted by the agent.
(b) A term rejected by the agent due to intersecting boundaries.

Now that we have written down an agent that accepts
squares with 4 Xs, we see that we have immediately
obtained a factorization of the Hamiltonian which
contains a term for each possible placement of these
operators. This factorization, which can be thought of
as a "projected entangled-pair operator" (i.e., the natural
generalization of projected entangled-pair states to
represent operators), is a tensor network with the tensors
located at the grid points; links between nodes indicate
that the corresponding indices of the two tensors
should be summed over. The tensors at each node are
of rank 6 - four of the dimensions correspond to the
links, and two correspond to the physical quantum operator.
There are only eight non-zero elements of this
tensor, corresponding to the eight entries in Table Ib:
(0, 0, 0, 0, I), (0, 0, 1, 1, X), (1, 0, 3, 2, X), etc.
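
The agent can be simulated directly from its transition table. The sketch below is an illustration under two assumptions that are not spelled out in the text: the set of valid "final" signals is taken to be Exterior, Boundary or Interior on every outgoing boundary channel, and the bottom-right agent is required to emit Interior in both directions (which excludes the all-identity pattern). The names `TRANSITIONS`, `accepts` and `accepted_terms` are hypothetical.

```python
from itertools import product

E, BX, B, IX, IN = ("Exterior", "Boundary w/ X", "Boundary",
                    "Interior w/ X", "Interior")

# (signal from above, signal from left, symbol) -> (signal sent right, signal sent down)
TRANSITIONS = {
    (E,  E,  "I"): (E,  E),
    (E,  E,  "X"): (BX, BX),
    (BX, E,  "X"): (IX, B),
    (E,  BX, "X"): (B,  IX),
    (E,  B,  "I"): (B,  IN),
    (B,  E,  "I"): (IN, B),
    (IX, IX, "X"): (IN, IN),
    (IN, IN, "I"): (IN, IN),
}

def accepts(grid):
    """Return True if the agent accepts the grid of 'I'/'X' symbols."""
    n_rows, n_cols = len(grid), len(grid[0])
    from_above = [E] * n_cols            # initial signals on the top boundary
    right_exits = []
    for r in range(n_rows):
        from_left = E                    # initial signal on the left boundary
        for c in range(n_cols):
            out = TRANSITIONS.get((from_above[c], from_left, grid[r][c]))
            if out is None:              # no transition: this term has zero weight
                return False
            from_left, from_above[c] = out
        right_exits.append(from_left)
    # Assumed final condition: no "w/ X" signal may leave the grid, and the
    # bottom-right agent must emit Interior both to the right and downward.
    exits = right_exits + from_above
    return (all(s in (E, B, IN) for s in exits)
            and right_exits[-1] == IN and from_above[-1] == IN)

def accepted_terms(n_rows, n_cols):
    """Enumerate every accepted pattern by brute force (small grids only)."""
    terms = []
    for flat in product("IX", repeat=n_rows * n_cols):
        grid = [flat[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]
        if accepts(grid):
            terms.append(grid)
    return terms
```

Because each combination of incoming signals and symbol appears at most once in the table, the signals for a given pattern are determined by a single pass over the grid; the nondeterminism only matters when the agent is used to generate terms rather than to check them.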



FIG. 17: Tensor network giving the expectation of some
operator with respect to the state shown in figure 14a.
Note that the index $l$ connects three tensors.



D. Calculation of expectations using recursion 

Assume that we have a tensor network state and a factorization
of an operator which has the same network
structure as the state. As in section II D, we see that the
expectation of the operator may be reduced to the contraction
of a network of "transfer matrices"; for example,
for the peculiar state shown in section III B, calculating
the expectation of our operator is equivalent to contracting
a tensor network of the form shown in figure 14a.$^8$

We may wish to minimize the energy with respect to
some site $n$. As discussed in section II E, we can reduce
this to an eigenvalue problem for the matrix consisting
of the contraction of (essentially) all of the transfer matrices
except the one at $n$. This contraction can be expressed
as a set of recursion rules; for example, for a
two-dimensional grid we have the following rules:





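$$L_{i,j} = L_{i-1,j} \cdot C_{i-1,j}, \quad L_{1,j} = I,$$
$$R_{i,j} = R_{i+1,j} \cdot C_{i+1,j}, \quad R_{N,j} = I,$$
$$C_{i,j} = A_{i,j} \cdot E_{i,j} \cdot B_{i,j},$$
$$A_{i,j} = A_{i,j-1} \cdot E_{i,j-1}, \quad A_{i,1} = I,$$
$$B_{i,j} = B_{i,j+1} \cdot E_{i,j+1}, \quad B_{i,N} = I.$$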



(The • operation is implicitly over only the connected 
indices.) 



8 Again, there is an alternative viewpoint that considers the transfer
matrices themselves to be the primary object of interest.
In particular, there is a generalization of finitely correlated
states, known as "contour correlated states" |2'J, which characterizes
quantum states on a two-dimensional lattice by a map
$\mathbb{E} : \mathcal{A}_\Lambda \otimes \mathcal{B}_{\partial_-\Lambda} \to \mathcal{B}_{\partial_+\Lambda}$ that can be thought of as taking
observables in the lattice region $\Lambda$ (members of a C*-algebra $\mathcal{A}_\Lambda$)
to a transfer matrix in the tensor space $\mathcal{B}_{\partial_-\Lambda} \to \mathcal{B}_{\partial_+\Lambda}$ which
transports information through the region $\Lambda$ from its lower-left
contour $\partial_-\Lambda$ to its upper-right contour $\partial_+\Lambda$. This lattice is dual
to the lattice used by our agents, so the case where the region $\Lambda$
is a single tile corresponds to a transfer matrix derived from an
agent on the corresponding vertex in our lattice, with the minor
difference that our agents use a different direction (upper-left to
lower-right) for the flow of information on the lattice.



FIG. 18: Use of recursion to calculate $\mathcal{H}_{33}$.



The computation of $\mathcal{H}_{33}$ is illustrated in figure 18.

As was the case in section II E, as long as we move from
each site to an adjacent site, it takes us only (amortized)
$O(1)$ time to calculate $\mathcal{H}_{i,j}$. In Fig. 18, for example, we
see that to compute $\mathcal{H}_{43}$, we need only calculate $C_{33}$ and
then $L_{43}$.

Unfortunately, it is intractable to contract arbitrarily
large multi-dimensional tensor networks. (Formally,
Schuch et al. [23] have shown that this is a #P-complete$^9$
problem.) This is because whenever one contracts together
tensors with more than two indices, one obtains a
larger tensor. For example, when taking the dot-product
between two four-index tensors one obtains a six-index
tensor,

$$\sum_d A_{abcd} B_{defg} = C_{abcefg}.$$

These extra indices result in "double-bonds" between
tensors. (We have already seen multiple bonds when
computing the $E$ matrices, as shown in figure 11.)
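
The growth in the number of indices is easy to reproduce. In the sketch below the tensors are filled with ones and the bond dimension `chi = 3` is an arbitrary choice made for illustration.

```python
import numpy as np

chi = 3
A = np.ones((chi,) * 4)                    # A_{abcd}
B = np.ones((chi,) * 4)                    # B_{defg}
C = np.tensordot(A, B, axes=([3], [0]))    # sum over the shared index d
# C has six indices: more than either of the tensors it was built from.

# After a boundary row has absorbed k rows of the grid, neighbouring tensors
# in it are joined by k bonds; fusing them gives one bond of dimension chi**k.
fused_bond_dims = [chi ** k for k in range(1, 5)]
```

The list `fused_bond_dims` makes the exponential growth explicit: each additional row multiplies the fused bond dimension by another factor of `chi`.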

Thus, as we contract each row, the size of our tensors 
increases by some factor, which means that the cost of 
contracting a tensor network in general grows exponen- 
tially with the size of the network! Fortunately, there 
is a lossy compression technique which involves approx- 
imating a row resulting from a contraction with a new 
row with fewer bonds; this has been used successfully 
to model hard-core bosons in a two-dimensional optical 
lattice [H] . 



9 #P-complete problems are the counting equivalent of NP- 
complete problems, and are widely thought to be computation- 
ally intractable. 
IV. CONCLUSION

In this paper, we have introduced a type of diagram 
for representing matrix product states. We used this 
to demonstrate that there is a formal equivalence be- 
tween matrix product states and operators and complex- 
weighted finite state automata. This equivalence was 
used to present a method by which one could factor a 
matrix product operator by reasoning about these com- 
plex weighted finite state automata. We then showed 
how such a matrix product factorization of an operator 
allows one to compute expectations of that operator in 
O(N) time, and to perform energy minimization at an 
amortized cost of only $O(1)$ per step. A generalization
of this procedure was presented that allows one to carry
the same process through for systems with more than one
spatial dimension.

As a closing remark, we note that this formalism is in- 
teresting not only because of its practical application in 
simulating physical systems, but also because it relates 
our ability to efficiently simulate physical systems with 
a broader theory (the automata hierarchy or formal lan- 
guage theory) which deals with the fundamental limits 
of computation. It would be interesting to see whether 
there are other insights from this theory that could be 
used to improve techniques for simulating physical sys- 
tems. 
FIG. 19: The matrix product diagram is given for a
4-particle "cat" state. Some transformations are
demonstrated that result in equivalent representations
for this state.
(a) Matrix product diagram for "cat" state.
(b) Nodes are rearranged in the middle of the diagram.
(c) A connection is added which results in two additional walks
that cancel each other.
(d) A node is inserted in the middle of the diagram.



APPENDIX A: VISUALIZATION OF MATRIX 
DEGREE OF FREEDOM 

There is a matrix "degree of freedom" that allows us to 
manipulate our state into equivalent forms. To visualize 
this process, we consider the 4-particle "cat" state, 



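$$\uparrow\uparrow\uparrow\uparrow + \downarrow\downarrow\downarrow\downarrow = \begin{pmatrix} \uparrow & \downarrow \end{pmatrix} \cdot \begin{pmatrix} \uparrow & 0 \\ 0 & \downarrow \end{pmatrix} \cdot \begin{pmatrix} \uparrow & 0 \\ 0 & \downarrow \end{pmatrix} \cdot \begin{pmatrix} \uparrow \\ \downarrow \end{pmatrix},$$

with the corresponding diagram shown in figure 19a.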

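The matrix degree of freedom comes from the fact that
we may insert any matrix $X$ and its inverse $X^{-1}$ into this
product without altering it:

$$\uparrow\uparrow\uparrow\uparrow + \downarrow\downarrow\downarrow\downarrow = \underbrace{\begin{pmatrix} \uparrow & \downarrow \end{pmatrix}}_{A} \cdot \underbrace{\begin{pmatrix} \uparrow & 0 \\ 0 & \downarrow \end{pmatrix}}_{B} \cdot X \cdot X^{-1} \cdot \underbrace{\begin{pmatrix} \uparrow & 0 \\ 0 & \downarrow \end{pmatrix}}_{C} \cdot \underbrace{\begin{pmatrix} \uparrow \\ \downarrow \end{pmatrix}}_{D}$$
$$= A \cdot \underbrace{(B \cdot X)}_{B'} \cdot \underbrace{(X^{-1} \cdot C)}_{C'} \cdot D.$$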
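
This invariance can be checked numerically. The sketch below stores each site of the 4-particle cat state as an array indexed by `[physical, left bond, right bond]` (a layout chosen here for convenience, not taken from the text) and compares every amplitude before and after inserting an invertible matrix and its inverse on the middle bond.

```python
import numpy as np
from itertools import product

UP, DOWN = 0, 1

# Site tensors indexed as [physical, left bond, right bond].
A = np.zeros((2, 1, 2)); A[UP, 0, 0] = A[DOWN, 0, 1] = 1.0   # row vector (up, down)
B = np.zeros((2, 2, 2)); B[UP, 0, 0] = B[DOWN, 1, 1] = 1.0   # diag(up, down)
C = B.copy()
D = np.zeros((2, 2, 1)); D[UP, 0, 0] = D[DOWN, 1, 0] = 1.0   # column vector (up, down)

def amplitudes(tensors):
    """Map each spin configuration to its amplitude in the matrix product state."""
    amps = {}
    for spins in product((UP, DOWN), repeat=len(tensors)):
        m = np.eye(1)
        for t, s in zip(tensors, spins):
            m = m @ t[s]
        amps[spins] = m[0, 0]
    return amps

X = np.array([[1.0, 1.0], [0.0, 1.0]])     # any invertible matrix would do
B_new = B @ X                              # B' = B . X, applied for each physical index
C_new = np.linalg.inv(X) @ C               # C' = X^-1 . C

original = amplitudes([A, B, C, D])
transformed = amplitudes([A, B_new, C_new, D])
```

Both dictionaries assign weight one to the all-up and all-down configurations and zero to everything else, even though `B_new` and `C_new` contain extra non-zero entries.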



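This degree of freedom gives us many ways to alter
our diagram to produce equivalent representations. For
example, we may rearrange the nodes at any point in the
diagram, as shown in figure 19b, which is equivalent to
introducing the matrix

$$X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},$$

$$\uparrow\uparrow\uparrow\uparrow + \downarrow\downarrow\downarrow\downarrow = \begin{pmatrix} \uparrow & \downarrow \end{pmatrix} \cdot \begin{pmatrix} 0 & \uparrow \\ \downarrow & 0 \end{pmatrix} \cdot \begin{pmatrix} 0 & \downarrow \\ \uparrow & 0 \end{pmatrix} \cdot \begin{pmatrix} \uparrow \\ \downarrow \end{pmatrix}.$$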



We may also add additional connections between nodes
in our diagrams, as long as the additional paths which
result cancel each other out. So for example, we may
modify our diagram to become as shown in figure 19c. Note
how the paths introduced by the two new connections
have opposite phases, and thus cancel. This change
corresponds to introducing the matrix

$$X = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix},$$

$$\uparrow\uparrow\uparrow\uparrow + \downarrow\downarrow\downarrow\downarrow = \begin{pmatrix} \uparrow & \downarrow \end{pmatrix} \cdot \begin{pmatrix} \uparrow & \uparrow \\ 0 & \downarrow \end{pmatrix} \cdot \begin{pmatrix} \uparrow & -\downarrow \\ 0 & \downarrow \end{pmatrix} \cdot \begin{pmatrix} \uparrow \\ \downarrow \end{pmatrix}.$$

$X$ need not be a square matrix, as long as the product
$X \cdot X^{-1}$ results in an identity matrix with the appropriate
dimensions. This gives us the freedom to introduce
empty nodes into our diagram, moving the connected
nodes apart, as with the matrix

$$X = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix},$$

which results in the diagram shown in figure 19d.

The fact that we moved them apart means that we
should be able to move them back together again; this
can be done with the matrix

$$X = \begin{pmatrix} 1 & 0 \\ 0 & 0 \\ 0 & 1 \end{pmatrix}.$$

This matrix does not have the property that $X \cdot X^{-1}$
gives us an identity; rather, it gives us a projector onto
a two-dimensional subspace. This is perfectly fine, however,
as in this case the third dimension (corresponding
to the unused node) was actually redundant. It is
obvious for this diagram that this was the case, but it
might not be so obvious for a general diagram. Nonetheless,
numerically it is easy to eliminate such redundant
nodes between two matrices by contracting the two tensors
along their adjoining index and then using a singular
value decomposition to split them back apart, dropping
all (post-decomposition) vertices which correspond to zero
singular values.


APPENDIX B: PERIODIC BOUNDARY 
CONDITIONS 

Up to this point we have been implicitly assuming open 
boundary conditions on our system. In many situations, 
however, one wants to impose periodic boundary condi- 
tions. To do this, an additional bond is added between 
the first and last matrix: 



$$\sum_{i,j,k,z} A_{zi} B_{ij} C_{jk} D_{kz}.$$



Our automaton interpretation is modified as follows:
rather than having initial and final distributions (i.e.,
starting states and ending states), we instead allow the
automaton to start on any state, but restrict it to only accept
strings which cause it to end on the state at which it
started. This is equivalent to revising (0| to replace the 
initial distribution a and the final distribution 17 with a 
trace, 



$$f(a_0 a_1 \dots a_N) = \operatorname{tr}\left(W^{a_0} \cdot W^{a_1} \cdots W^{a_N}\right).$$
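
A minimal sketch of this trace formula follows. The dictionary `W` of transition matrices and the two-state example automaton are hypothetical, chosen only so that the result is easy to check by hand.

```python
import numpy as np

def periodic_weight(W, string):
    """Weight tr(W[a_0] @ W[a_1] @ ... @ W[a_N]) assigned to an input string.

    W maps each symbol to a square transition matrix.
    """
    d = next(iter(W.values())).shape[0]
    M = np.eye(d)
    for a in string:
        M = M @ W[a]
    return np.trace(M)

# Example: a two-state automaton that stays in whichever state it starts in,
# emitting "u" from state 0 and "d" from state 1.
W = {"u": np.diag([1.0, 0.0]), "d": np.diag([0.0, 1.0])}
weights = {s: periodic_weight(W, s) for s in ("uuuu", "dddd", "uudd")}
```

The trace sums over every possible starting state and keeps only the walks that return to it, so no separate initial or final distribution is needed.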

This idea generalizes straightforwardly to multiple di- 
mensions. 